A Methodology of Error Detection: Improving Speech Recognition in Radiology
Author
Abstract
Automated speech recognition (ASR) in radiology report dictation demands highly accurate and robust recognition software. Despite vendor claims, current implementations are suboptimal, leading to poor accuracy and to time and money wasted on proofreading. Thus, other methods must be considered for increasing the reliability and performance of ASR before it becomes a viable alternative to human transcription. One such method is post-ASR error detection, used to recover from the inaccuracy of speech recognition. This thesis proposes that detecting and highlighting errors, or areas of low confidence, in a machine-transcribed report allows the radiologist to proofread more efficiently. This, in turn, restores the benefits of ASR in radiology, including efficient report handling and resource utilization. To this end, an objective classification of error-detection methods for ASR is established. Under this classification, a new theory of error detection in ASR is derived from the hybrid application of multiple error-detection heuristics. This theory is contingent upon the type of recognition errors and the complementary coverage of the heuristics. Inspired by these principles, a hybrid error-detection application is developed as proof of concept. The algorithm relies on four separate artificial-intelligence heuristics that together cover semantic, syntactic, and structural error types and were developed with the help of 2700 anonymised reports obtained from a local radiology clinic. Two heuristics involve statistical modeling: pointwise mutual information and co-occurrence analysis. The remaining two are non-statistical techniques: a property-based, constraint-handling-rules grammar, and a conceptual distance metric relying on the ontological knowledge in the Unified Medical Language System.
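To make the pointwise mutual information (PMI) heuristic concrete, the following is a minimal sketch rather than the thesis's implementation. It assumes PMI is computed over adjacent word pairs and that a word is flagged when its PMI with the preceding word falls below a threshold. The function names, the threshold of 0.0, and the three-sentence toy corpus are all hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def train_pmi(sentences):
    """Count unigrams and adjacent-word bigrams from tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def pmi(w1, w2, unigrams, bigrams):
    """PMI = log2(p(w1, w2) / (p(w1) * p(w2))); -inf if the pair was never seen."""
    joint = bigrams[(w1, w2)]
    if joint == 0:
        return float("-inf")
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_joint = joint / n_bi
    p_w1 = unigrams[w1] / n_uni
    p_w2 = unigrams[w2] / n_uni
    return math.log2(p_joint / (p_w1 * p_w2))


def flag_low_pmi(tokens, unigrams, bigrams, threshold=0.0):
    """Return indices of tokens whose PMI with the preceding token is below threshold."""
    return [
        i
        for i in range(1, len(tokens))
        if pmi(tokens[i - 1], tokens[i], unigrams, bigrams) < threshold
    ]


# Toy training corpus (invented for illustration, not taken from the 2700 reports).
corpus = [
    "no acute fracture is seen".split(),
    "no acute disease is seen".split(),
    "the lungs are clear".split(),
]
unigrams, bigrams = train_pmi(corpus)

# "fraction" stands in for a misrecognised "fracture"; both word pairs that
# involve it are unseen in the corpus, so positions 2 and 3 are flagged.
suspect = flag_low_pmi("no acute fraction is seen".split(), unigrams, bigrams)
print(suspect)
```

A real system would need smoothing for unseen but legitimate word pairs and a threshold tuned on held-out reports; the sketch treats every unseen pair as suspicious, which would over-flag on realistic data.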
When the hybrid algorithm is applied to thirty real-world radiology reports, the results are encouraging: up to a 24% increase in recall and an 8% increase in precision over the best single technique. In addition, the resulting algorithm is efficient and modular.
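The recall and precision figures above can be read against the standard definitions: precision is the fraction of flagged words that are real errors, and recall is the fraction of real errors that were flagged. The sketch below shows how combining detectors with complementary coverage can raise both. It assumes the hybrid simply takes the union of the detectors' flags, which the abstract does not specify, and every word position in it is invented for illustration.

```python
def precision_recall(flagged, true_errors):
    """Return (precision, recall) for a set of flagged word positions."""
    flagged, true_errors = set(flagged), set(true_errors)
    true_positives = len(flagged & true_errors)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(true_errors) if true_errors else 0.0
    return precision, recall


# Hypothetical word positions, not results from the thesis.
true_errors = {2, 7, 11, 15}   # positions of actual recognition errors
detector_a = {2, 7, 9}         # e.g. flags from a statistical heuristic
detector_b = {11, 9}           # e.g. flags from a grammar-based heuristic

# Assumed combination rule: flag a word if any detector flags it.
hybrid = detector_a | detector_b

for name, flags in [("A", detector_a), ("B", detector_b), ("hybrid", hybrid)]:
    p, r = precision_recall(flags, true_errors)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy case the union finds errors that each detector misses on its own, so recall rises, and because both detectors share the same false positive, precision rises as well. With less overlap among false positives, a union can lower precision, which is why the complementary coverage of the heuristics matters.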
Related resources
Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
Study of Overlapped Speech Detection for NIST SRE Summed Channel Speaker Recognition
This paper studies the overlapped speech detection for improving the performance of the summed channel speaker recognition system in NIST Speaker Recognition Evaluation (SRE). The speaker recognition system includes four main modules: voice activity detection, speaker diarization, overlapped speaker detection and speaker recognition. We adopt a GMM based overlapped speaker detection system, by ...
Publication date: 2006